Abstract

The problem of finding a vector with the fewest nonzero elements that satisfies an underdeter- 
mined system of linear equations is an NP-complete problem that is typically solved numeri- 
cally via convex heuristics or nicely-behaved nonconvex relaxations. In this paper we consider 
the elementary method of alternating projections (MAP) for solving the sparsity optimization 
problem without employing convex heuristics. In a parallel paper we recently introduced the 
restricted normal cone which generalizes the classical Mordukhovich normal cone and recon- 
ciles some fundamental gaps in the theory of sufficient conditions for local linear convergence 
of the MAP algorithm. We use the restricted normal cone together with the notion of superreg- 
ularity, which is naturally satisfied for the affine sparse optimization problem, to obtain local 
linear convergence results with estimates for the radius of convergence of the MAP algorithm 
applied to sparsity optimization with an affine constraint. 
1 Introduction



We consider the problem of sparsity optimization with affine constraints:

(1)  minimize $\|x\|_0$ subject to $Mx = p$

where $m$ and $n$ are integers such that $1 \le m < n$, $M$ is a real $m$-by-$n$ matrix, denoted $M \in \mathbb{R}^{m\times n}$, and $\|x\|_0 := \sum_{j=1}^{n} |\operatorname{sgn}(x_j)|$ (with $\operatorname{sgn}(0) := 0$) counts the number of nonzero entries of real vectors $x$ of length $n$, denoted by $x \in \mathbb{R}^n$.

If there is some a priori bound on the desired sparsity of the solution, represented by an integer $s$, where $1 \le s \le n$, then one can relax (1) to the feasibility problem

(2)  find $c \in A \cap B$,

where

(3)  $A := \{x \in \mathbb{R}^n \mid \|x\|_0 \le s\}$ and $B := \{x \in \mathbb{R}^n \mid Mx = p\}$.

The sparsity subspace associated with $a = (a_1,\ldots,a_n) \in \mathbb{R}^n$ is

(4)  $\operatorname{supp}(a) := \{x \in \mathbb{R}^n \mid x_j = 0 \text{ whenever } a_j = 0\}$.

Also, we define

(5)  $I \colon \mathbb{R}^n \to 2^{\{1,\ldots,n\}} \colon x \mapsto \{i \in \{1,\ldots,n\} \mid x_i \neq 0\}$,

and we denote the $i$th standard unit vector by $e_i$, for every $i \in \{1,\ldots,n\}$.
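The objects in (1), (4) and (5) are simple to compute. The following is a minimal illustrative sketch, not part of the original text; the function names `index_set`, `l0` and `in_supp` are hypothetical, and indices are 0-based rather than 1-based as in (5).

```python
from typing import Sequence, Set


def index_set(x: Sequence[float]) -> Set[int]:
    """I(x) from (5): the (0-based) indices of the nonzero entries of x."""
    return {i for i, xi in enumerate(x) if xi != 0}


def l0(x: Sequence[float]) -> int:
    """||x||_0 from (1): the number of nonzero entries of x."""
    return len(index_set(x))


def in_supp(x: Sequence[float], a: Sequence[float]) -> bool:
    """True iff x lies in supp(a) from (4), i.e. x_j = 0 whenever a_j = 0."""
    return index_set(x) <= index_set(a)


x = [0.0, 3.0, 0.0, -1.5]
a = [1.0, 2.0, 0.0, 4.0]
print(index_set(x), l0(x), in_supp(x, a), in_supp(a, x))
```

Here `x` has two nonzero entries and vanishes wherever `a` does, so `x` lies in supp(`a`) but not conversely.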

Problem (1) is in general NP-complete [21] and so convex and nonconvex relaxations are typically employed for its solution. For a primal-dual convex strategy see [6]; for relaxations to $\ell_p$ ($0 < p < 1$) see [15]; see [8] for a comprehensive review and applications. In this paper we apply recent tools developed by the authors in [3] to prove local linear convergence of an elementary algorithm applied to the feasibility formulation of the problem (2), that is, we do not use convex heuristics or conventional smooth relaxations. The key to our results is a new normal cone called the restricted normal cone. A central feature of our approach is the decomposition of the original nonconvex set into collections of simpler (indeed, linear) sets which can be treated separately. Ours is not the first result on local linear convergence for sparsity optimization with affine constraints. Indeed the problem was considered more than twenty years ago by Combettes and Trussell, who show local convergence of alternating projections [11]. The problem was recently used to illustrate the application of analytical tools developed in [17] and [18]. Other approaches that also yield convergence results for different algorithms can be found in [1] and [5], with the latter of these being notable in that they obtain global convergence results with additional assumptions (restricted isometry) that we do not consider here. The novelty of the results we report here, based principally on the works [17], [16] and [3], is that we obtain not only convergence rates but also radii of convergence when all conventional sufficient conditions for local linear convergence, notably those of [17] and [16], fail. In this sense, our criteria for convergence are more robust and yield richer information than other available notions.

The remainder of the paper is organized as follows. In Section 2 we define the restricted normal cones and corresponding constraint qualifications for sets and collections of sets first introduced in [3], as well as the notion of superregularity introduced in [16], adapted to the restricted normal cones. A few of the many properties of these objects developed in [3] are restated in preparation for Section 3, where we apply these tools to a convergence analysis of the method of alternating projections (MAP) for the problem of finding a vector $c \in \mathbb{R}^n$ satisfying an affine constraint and having sparsity no greater than some a priori bound, that is, we solve (2) for $A$ and $B$ defined by (3). Given a starting point $b_{-1} \in X$, MAP sequences $(a_k)_{k\in\mathbb{N}}$ and $(b_k)_{k\in\mathbb{N}}$ are generated as follows:

(6)  $(\forall k \in \mathbb{N}) \quad a_k := P_A b_{k-1}, \quad b_k := P_B a_k.$

We do not attempt to review the history of the MAP, its many extensions, and its rich convergence theory; the interested reader is referred to, e.g., [2], [10], [12], and the references therein. We consider the MAP iteration to be a prototype for more sophisticated approaches, both of projection type or more generally subgradient algorithms, hence our focus on this simple algorithm.
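To make iteration (6) concrete, the following is a minimal sketch under stated assumptions rather than the implementation analyzed in this paper. It assumes a single affine equation $\langle m, x\rangle = p$ (so $B$ is a hyperplane), selects one element of $P_A$ by keeping $s$ entries of largest magnitude (anticipating Section 3), and uses hypothetical names and toy data throughout.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def project_sparse(x: Vector, s: int) -> Vector:
    """One element of P_A(x): keep s entries of largest magnitude, zero the rest."""
    keep = set(sorted(range(len(x)), key=lambda i: -abs(x[i]))[:s])
    return [xi if i in keep else 0.0 for i, xi in enumerate(x)]


def project_hyperplane(x: Vector, m: Vector, p: float) -> Vector:
    """P_B(x) for B = {x | <m, x> = p}; m is assumed to be nonzero."""
    r = (sum(mi * xi for mi, xi in zip(m, x)) - p) / sum(mi * mi for mi in m)
    return [xi - r * mi for mi, xi in zip(m, x)]


def map_iterate(b: Vector,
                proj_a: Callable[[Vector], Vector],
                proj_b: Callable[[Vector], Vector],
                iters: int) -> Tuple[Vector, Vector]:
    """Run iteration (6): a_k = P_A(b_{k-1}), b_k = P_B(a_k)."""
    a = b
    for _ in range(iters):
        a = proj_a(b)
        b = proj_b(a)
    return a, b


# Toy data (an assumption for illustration): n = 3, one equation, s = 1.
m, p, s = [1.0, 2.0, 2.0], 3.0, 1
a, b = map_iterate([0.1, 0.2, 1.4],
                   lambda x: project_sparse(x, s),
                   lambda x: project_hyperplane(x, m, p),
                   iters=80)
print(a, b)
```

For this starting point the iterates stay on the third coordinate axis and approach the 1-sparse feasible point $(0, 0, 1.5)$; other starting points may approach a different point of $A \cap B$.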

Notation 

Our notation is standard and follows largely [7], [20], [22], and [23], to which the reader is referred for more background on variational analysis. Throughout this paper, we assume that $X = \mathbb{R}^n$ with inner product $\langle\cdot,\cdot\rangle$, induced norm $\|\cdot\|$, and induced metric $d$. The real numbers are $\mathbb{R}$, the integers are $\mathbb{Z}$, and $\mathbb{N} := \{z \in \mathbb{Z} \mid z \ge 0\}$. Further, $\mathbb{R}_+ := \{x \in \mathbb{R} \mid x \ge 0\}$, $\mathbb{R}_{++} := \{x \in \mathbb{R} \mid x > 0\}$. Let $R$ and $S$ be subsets of $X$. Then the closure of $S$ is $\overline{S}$, the interior of $S$ is $\operatorname{int}(S)$, the boundary of $S$ is $\operatorname{bdry}(S)$, and the smallest affine and linear subspaces containing $S$ are $\operatorname{aff} S$ and $\operatorname{span} S$, respectively. If $Y$ is an affine subspace of $X$, then $\operatorname{par} Y$ is the unique linear subspace parallel to $Y$. The negative polar cone of $S$ is $S^{\ominus} = \{u \in X \mid \sup\langle u, S\rangle \le 0\}$. We also set $S^{\oplus} := -S^{\ominus}$ and $S^{\perp} := S^{\oplus} \cap S^{\ominus}$. We also write $R \oplus S$ for $R + S := \{r + s \mid (r,s) \in R \times S\}$ provided that $R \perp S$, i.e., $(\forall (r,s) \in R \times S)$ $\langle r,s\rangle = 0$. We write $F \colon X \rightrightarrows X$ if $F$ is a mapping from $X$ to its power set, i.e., $\operatorname{gr} F$, the graph of $F$, lies in $X \times X$. Abusing notation slightly, we will write $F(x) = y$ if $F(x) = \{y\}$. A nonempty subset $K$ of $X$ is a cone if $(\forall \lambda \in \mathbb{R}_+)$ $\lambda K := \{\lambda k \mid k \in K\} \subseteq K$. The smallest cone containing $S$ is denoted $\operatorname{cone}(S)$; thus, $\operatorname{cone}(S) := \mathbb{R}_+ \cdot S := \{\rho s \mid \rho \in \mathbb{R}_+, s \in S\}$ if $S \neq \varnothing$ and $\operatorname{cone}(\varnothing) := \{0\}$. If $z \in X$ and $\rho \in \mathbb{R}_{++}$, then $\operatorname{ball}(z;\rho) := \{x \in X \mid d(z,x) \le \rho\}$ is the closed ball centered at $z$ with radius $\rho$ while $\operatorname{sphere}(z;\rho) := \{x \in X \mid d(z,x) = \rho\}$ is the (closed) sphere centered at $z$ with radius $\rho$. If $u$ and $v$ are in $X$, then $[u,v] := \{(1-\lambda)u + \lambda v \mid \lambda \in [0,1]\}$ is the line segment connecting $u$ and $v$.
2 Foundations



We review in this section some of the fundamental tools used in the analysis of projection algorithms, and in particular MAP, for the solution of feasibility problems like (2). The tools below are intended for more general situations where the sets $A$ and $B$ might admit decompositions into unions of sets, in which case we consider the feasibility problem

(7)  find $c \in \Big(\bigcup_{i\in I} A_i\Big) \cap \Big(\bigcup_{j\in J} B_j\Big).$

Central to the convergence analysis of the MAP algorithm for solving (7) is the notion of regularity of the intersection and the regularity of neighborhoods of the intersection. These ideas are developed in detail in [3]. We review the main points relevant to our application here.

Normal cones are used to provide information about the orientation and local geometry of subsets of $X$. There are many species of normal cones; the key ones for our purposes are defined here. In addition to the classical notions (proximal, Fréchet, Mordukhovich) we define the restricted normal cone introduced and developed in [3].

Definition 2.1 (normal cones) Let $A$ and $B$ be nonempty subsets of $X$, and let $a$ and $u$ be in $X$. If $a \in A$, then various normal cones of $A$ at $a$ are defined as follows:

(i) The $B$-restricted proximal normal cone of $A$ at $a$ is

(8)  $\widehat{N}^B_A(a) := \operatorname{cone}\big((B \cap P_A^{-1}a) - a\big) = \operatorname{cone}\big((B - a) \cap (P_A^{-1}a - a)\big).$

(ii) The (classical) proximal normal cone of $A$ at $a$ is

(9)  $N^{\mathrm{prox}}_A(a) := \widehat{N}^X_A(a) = \operatorname{cone}\big(P_A^{-1}a - a\big).$

(iii) The $B$-restricted normal cone $N^B_A(a)$ is implicitly defined by $u \in N^B_A(a)$ if and only if there exist sequences $(a_k)_{k\in\mathbb{N}}$ in $A$ and $(u_k)_{k\in\mathbb{N}}$ in $\widehat{N}^B_A(a_k)$ such that $a_k \to a$ and $u_k \to u$.

(iv) The Fréchet normal cone $N^{\mathrm{Fr\acute{e}}}_A(a)$ is implicitly defined by $u \in N^{\mathrm{Fr\acute{e}}}_A(a)$ if and only if $(\forall \varepsilon > 0)$ $(\exists \delta > 0)$ $(\forall x \in A \cap \operatorname{ball}(a;\delta))$ $\langle u, x - a\rangle \le \varepsilon\|x - a\|$.

(v) The convex normal cone from convex analysis $N^{\mathrm{conv}}_A(a)$ is implicitly defined by $u \in N^{\mathrm{conv}}_A(a)$ if and only if $\sup\langle u, A - a\rangle \le 0$.

(vi) The Mordukhovich normal cone $N_A(a)$ of $A$ at $a$ is implicitly defined by $u \in N_A(a)$ if and only if there exist sequences $(a_k)_{k\in\mathbb{N}}$ in $A$ and $(u_k)_{k\in\mathbb{N}}$ in $N^{\mathrm{prox}}_A(a_k)$ such that $a_k \to a$ and $u_k \to u$.

If $a \notin A$, then all normal cones are defined to be empty.

The following elementary calculus rules are a restatement of [3, Proposition 3.7].

Proposition 2.2 Let $A$, $A_1$, $A_2$, $B$, $B_1$, and $B_2$ be nonempty subsets of $X$, let $c \in X$, and suppose that $a \in A \cap A_1 \cap A_2$. Then the following hold:

(i) If $A$ and $B$ are convex, then $\widehat{N}^B_A(a)$ is convex.

(ii) $\widehat{N}^{B_1\cup B_2}_A(a) = \widehat{N}^{B_1}_A(a) \cup \widehat{N}^{B_2}_A(a)$ and $N^{B_1\cup B_2}_A(a) = N^{B_1}_A(a) \cup N^{B_2}_A(a)$.

(iii) If $B \subseteq A$, then $\widehat{N}^B_A(a) = N^B_A(a) = \{0\}$.

(iv) If $A_1 \subseteq A_2$, then $\widehat{N}^B_{A_2}(a) \subseteq \widehat{N}^B_{A_1}(a)$.

(v) $-\widehat{N}^B_A(a) = \widehat{N}^{-B}_{-A}(-a)$, $-N^B_A(a) = N^{-B}_{-A}(-a)$, and $-N_A(a) = N_{-A}(-a)$.

(vi) $\widehat{N}^B_A(a) = \widehat{N}^{B-c}_{A-c}(a - c)$ and $N^B_A(a) = N^{B-c}_{A-c}(a - c)$.

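The constraint qualification number, or CQ-number, defined next is built upon the normal cone and quantifies classical notions of constraint qualifications for set intersections that indicate sufficient regularity of the intersection.

Definition 2.3 ((joint) CQ-number) Let $A$, $\widetilde{A}$, $B$, $\widetilde{B}$ be nonempty subsets of $X$, let $c \in X$, and let $\delta \in \mathbb{R}_{++}$. The CQ-number at $c$ associated with $(A, \widetilde{A}, B, \widetilde{B})$ and $\delta$ is

(10)  $\theta_\delta := \theta_\delta(A,\widetilde{A},B,\widetilde{B}) := \sup\Big\{\langle u,v\rangle \;\Big|\; u \in \widehat{N}^{\widetilde{B}}_A(a),\ v \in -\widehat{N}^{\widetilde{A}}_B(b),\ \|u\| \le 1,\ \|v\| \le 1,\ \|a - c\| \le \delta,\ \|b - c\| \le \delta\Big\}.$

The limiting CQ-number at $c$ associated with $(A,\widetilde{A},B,\widetilde{B})$ is

(11)  $\overline{\theta} := \overline{\theta}(A,\widetilde{A},B,\widetilde{B}) := \lim_{\delta \downarrow 0} \theta_\delta(A,\widetilde{A},B,\widetilde{B}).$

For nontrivial collections² $\mathcal{A} := (A_i)_{i\in I}$, $\widetilde{\mathcal{A}} := (\widetilde{A}_i)_{i\in I}$, $\mathcal{B} := (B_j)_{j\in J}$, $\widetilde{\mathcal{B}} := (\widetilde{B}_j)_{j\in J}$ of nonempty subsets of $X$, the joint-CQ-number at $c \in X$ associated with $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$ and $\delta > 0$ is

(12)  $\theta_\delta = \theta_\delta(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}}) := \sup_{(i,j)\in I\times J} \theta_\delta(A_i, \widetilde{A}_i, B_j, \widetilde{B}_j),$

and the limiting joint-CQ-number at $c$ associated with $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$ is

(13)  $\overline{\theta} = \overline{\theta}(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}}) := \lim_{\delta \downarrow 0} \theta_\delta(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}}).$

The CQ-number is obviously an instance of the joint-CQ-number when $I$ and $J$ are singletons. When the arguments are clear from the context we will simply write $\theta_\delta$ and $\overline{\theta}$.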



² The collection $(A_i)_{i\in I}$ is said to be nontrivial if $I \neq \varnothing$.
Using Proposition 2.2(vi) we see that, for every $x \in X$,

(14)  $\theta_\delta(A, \widetilde{A}, B, \widetilde{B})$ at $c$ $= \theta_\delta(A - x, \widetilde{A} - x, B - x, \widetilde{B} - x)$ at $c - x$.

The CQ-number is based on the behavior of the restricted proximal normal cone in a neighborhood of a given point. A related notion is that of the exact CQ-number, defined next, which is based on the restricted normal cone at the point instead of nearby restricted proximal normal cones. In both instances, the important case to consider is when $c \in A \cap B$ (or when $c \in A_i \cap B_j$ in the joint-CQ case).

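Definition 2.4 (exact CQ-number and exact joint-CQ-number) Let $c \in X$.

(i) Let $A$, $\widetilde{A}$, $B$ and $\widetilde{B}$ be nonempty subsets of $X$. The exact CQ-number at $c$ associated with $(A, \widetilde{A}, B, \widetilde{B})$ is

(15)  $\alpha := \alpha(A,\widetilde{A},B,\widetilde{B}) := \sup\Big\{\langle u,v\rangle \;\Big|\; u \in N^{\widetilde{B}}_A(c),\ v \in -N^{\widetilde{A}}_B(c),\ \|u\| \le 1,\ \|v\| \le 1\Big\},$

where we define $\alpha = -\infty$ in the case that $c \notin A \cap B$, which is consistent with the convention $\sup\varnothing = -\infty$.

(ii) Let $\mathcal{A} := (A_i)_{i\in I}$, $\widetilde{\mathcal{A}} := (\widetilde{A}_i)_{i\in I}$, $\mathcal{B} := (B_j)_{j\in J}$ and $\widetilde{\mathcal{B}} := (\widetilde{B}_j)_{j\in J}$ be nontrivial collections of nonempty subsets of $X$. The exact joint-CQ-number at $c$ associated with $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$ is

(16)  $\alpha := \alpha(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}}) := \sup_{(i,j)\in I\times J} \alpha(A_i, \widetilde{A}_i, B_j, \widetilde{B}_j).$

The next result, which we quote from [3, Theorem 7.8], establishes relationships between the condition numbers defined above.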

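Theorem 2.5 Let $\mathcal{A} := (A_i)_{i\in I}$, $\widetilde{\mathcal{A}} := (\widetilde{A}_i)_{i\in I}$, $\mathcal{B} := (B_j)_{j\in J}$ and $\widetilde{\mathcal{B}} := (\widetilde{B}_j)_{j\in J}$ be nontrivial collections of nonempty subsets of $X$. Set $A := \bigcup_{i\in I} A_i$ and $B := \bigcup_{j\in J} B_j$, and suppose that $c \in A \cap B$. Denote the exact joint-CQ-number at $c$ associated with $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$ by $\alpha$, the joint-CQ-number at $c$ associated with $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$ and $\delta > 0$ by $\theta_\delta$, and the limiting joint-CQ-number at $c$ associated with $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$ by $\overline{\theta}$. Then the following hold:

(i) If $\alpha < 1$, then the $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$-CQ condition holds at $c$.

(ii) $\alpha \le \theta_\delta$.

(iii) $\alpha \le \overline{\theta}$.

If in addition $I$ and $J$ are finite, then the following hold:

(iv) $\alpha = \overline{\theta}$.

(v) The $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$-joint-CQ condition holds at $c$ if and only if $\alpha = \overline{\theta} < 1$.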

The CQ-number is related to the angle of intersection of the sets. The case of linear subspaces underscores the subtleties of this idea and illustrates the connection between the CQ-number and the correct notion of an angle of intersection. The Friedrichs angle [13] (or simply the angle) between subspaces $A$ and $B$ is the number in $[0, \tfrac{\pi}{2}]$ whose cosine is given by

(17)  $c(A,B) := \sup\big\{|\langle a,b\rangle| \;\big|\; a \in A \cap (A\cap B)^{\perp},\ b \in B \cap (A\cap B)^{\perp},\ \|a\| \le 1,\ \|b\| \le 1\big\},$

and we set $c(A,B) := c(\operatorname{par} A, \operatorname{par} B)$ if $A$ and $B$ are two intersecting affine subspaces of $X$. The following result is a consolidation of [3, Theorem 8.12 and Corollary 8.13].

Theorem 2.6 (CQ-number of two (affine) subspaces and Friedrichs angle) Let $A$ and $B$ be linear subspaces of $X$, and let $\delta > 0$. Then

(18)  $\theta_\delta(A,A,B,B) = \theta_\delta(A,X,B,B) = \theta_\delta(A,A,B,X) = c(A,B) < 1,$

where the CQ-number at $0$ is defined as in (10).

Moreover, if $A$ and $B$ are affine subspaces of $X$ with $c \in A \cap B$, and $\delta > 0$, then (18) holds at $c$.

An easy consequence of Theorem 2.6 is the case of two distinct lines through the origin, for which the CQ-number is simply the cosine of the angle between them ([3, Proposition 7.3]).

Corollary 2.7 (two distinct lines through the origin) Suppose that $w_a$ and $w_b$ are two vectors in $X$ such that $\|w_a\| = \|w_b\| = 1$. Let $A := \mathbb{R}w_a$, $B := \mathbb{R}w_b$, and $\delta > 0$. Assume that $A \cap B = \{0\}$. Then the CQ-number at $0$ is

(19)  $\theta_\delta(A,A,B,B) = \theta_\delta(A,X,B,B) = \theta_\delta(A,A,B,X) = c(A,B) = |\langle w_a, w_b\rangle| < 1.$
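As a numerical illustration of Corollary 2.7 (a sketch with hypothetical names and toy data, not taken from [3]), one can run the MAP between two lines through the origin in $\mathbb{R}^2$ and observe that each full cycle $b \mapsto P_B P_A b$ contracts the iterate by exactly $c(A,B)^2 = \langle w_a, w_b\rangle^2$, which is the squared CQ-number.

```python
import math


def proj_line(x, w):
    """Projection of x onto the line R*w, where w is assumed to be a unit vector."""
    t = sum(xi * wi for xi, wi in zip(x, w))
    return [t * wi for wi in w]


def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))


angle = math.pi / 6                      # angle between the two lines (toy choice)
w_a = [1.0, 0.0]
w_b = [math.cos(angle), math.sin(angle)]
c = abs(sum(u * v for u, v in zip(w_a, w_b)))   # c(A, B) = |<w_a, w_b>|

b = proj_line([2.0, 1.0], w_b)           # a starting point on B
ratios = []
for _ in range(5):
    b_next = proj_line(proj_line(b, w_a), w_b)  # one MAP cycle: P_B P_A
    ratios.append(norm(b_next) / norm(b))
    b = b_next

print(c ** 2, ratios)
```

The observed ratios agree with $c^2 = \cos^2(\pi/6) = 3/4$ up to rounding, matching the rate $\theta^2$ that appears in the convergence theory below.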

Convergence of MAP also requires a certain regularity on neighborhoods of the corresponding fixed points. For this we use a notion of regularity of the sets that is an adaptation to restricted normal cones of the type of regularity introduced in [16].

Definition 2.8 ((joint-) regularity and (joint-) superregularity) Let $A$ and $B$ be nonempty subsets of $X$, let $\mathcal{B} := (B_j)_{j\in J}$ be a nontrivial collection of nonempty subsets of $X$, and let $c \in X$.

(i) We say that $B$ is $(A, \varepsilon, \delta)$-regular at $c \in X$ if $\varepsilon \ge 0$, $\delta > 0$, and

(20)  $\left.\begin{array}{l} (y,b) \in B \times B,\\ \|y - c\| \le \delta,\ \|b - c\| \le \delta,\\ u \in \widehat{N}^A_B(b) \end{array}\right\} \;\Rightarrow\; \langle u, y - b\rangle \le \varepsilon\|u\| \cdot \|y - b\|.$

If $B$ is $(X, \varepsilon, \delta)$-regular at $c$, then we also simply speak of $(\varepsilon, \delta)$-regularity.

(ii) The set $B$ is called $A$-superregular at $c \in X$ if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $B$ is $(A, \varepsilon, \delta)$-regular at $c$. Again, if $B$ is $X$-superregular at $c$, then we also say that $B$ is superregular at $c$.

(iii) We say that $\mathcal{B}$ is $(A, \varepsilon, \delta)$-joint-regular at $c$ if $\varepsilon \ge 0$, $\delta > 0$, and for every $j \in J$, $B_j$ is $(A, \varepsilon, \delta)$-regular at $c$.

(iv) The collection $\mathcal{B}$ is $A$-joint-superregular at $c$ if for every $j \in J$, $B_j$ is $A$-superregular at $c$. We omit the prefix $A$ if $A = X$.

Joint-(super)regularity can be easily checked by any of the following conditions.

Proposition 2.9 Let $\mathcal{A} := (A_j)_{j\in J}$ and $\mathcal{B} := (B_j)_{j\in J}$ be nontrivial collections of nonempty subsets of $X$, let $c \in X$, let $(\varepsilon_j)_{j\in J}$ be a collection in $\mathbb{R}_+$, and let $(\delta_j)_{j\in J}$ be a collection in $]0,+\infty]$. Set $A := \bigcap_{j\in J} A_j$, $\varepsilon := \sup_{j\in J} \varepsilon_j$, and $\delta := \inf_{j\in J} \delta_j$. Then the following hold:

(i) If $\delta > 0$ and $(\forall j \in J)$ $B_j$ is $(A_j, \varepsilon_j, \delta_j)$-regular at $c$, then $\mathcal{B}$ is $(A, \varepsilon, \delta)$-joint-regular at $c$.

(ii) If $J$ is finite and $(\forall j \in J)$ $B_j$ is $(A_j, \varepsilon_j, \delta_j)$-regular at $c$, then $\mathcal{B}$ is $(A, \varepsilon, \delta)$-joint-regular at $c$.

(iii) If $J$ is finite and $(\forall j \in J)$ $B_j$ is $A_j$-superregular at $c$, then $\mathcal{B}$ is $A$-joint-superregular at $c$.

If in addition $\mathcal{B} := (B_j)_{j\in J}$ is a nontrivial collection of nonempty convex subsets of $X$ then, for $A \subseteq X$, $\mathcal{B}$ is $(0, +\infty)$-joint-regular, $(A, 0, +\infty)$-joint-regular, joint-superregular, and $A$-joint-superregular at $c \in X$.



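The framework of restricted normal cones allows for a great deal of flexibility in how one decomposes problems. Whatever the chosen decomposition, the following properties will be required.

(21)  $\left\{\begin{array}{l} \mathcal{A} := (A_i)_{i\in I} \text{ and } \mathcal{B} := (B_j)_{j\in J} \text{ are nontrivial collections of nonempty closed subsets of } X;\\[2pt] A := \bigcup_{i\in I} A_i \text{ and } B := \bigcup_{j\in J} B_j \text{ are closed};\\[2pt] c \in A \cap B;\\[2pt] \widetilde{\mathcal{A}} := (\widetilde{A}_i)_{i\in I} \text{ and } \widetilde{\mathcal{B}} := (\widetilde{B}_j)_{j\in J} \text{ are collections of nonempty subsets of } X \text{ such that}\\[2pt] \quad (\forall i \in I)\ P_{A_i}\big((\operatorname{bdry} B) \smallsetminus A\big) \subseteq \widetilde{A}_i,\\[2pt] \quad (\forall j \in J)\ P_{B_j}\big((\operatorname{bdry} A) \smallsetminus B\big) \subseteq \widetilde{B}_j;\\[2pt] \widetilde{A} := \bigcup_{i\in I} \widetilde{A}_i \text{ and } \widetilde{B} := \bigcup_{j\in J} \widetilde{B}_j. \end{array}\right.$

With the above assumptions one can establish rates of convergence for the MAP algorithm.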
Theorem 2.10 (convergence rate, Corollary 10.8 of [3]) Assume that (21) holds and that there exists $\delta > 0$ such that

(i) $\mathcal{A}$ is $(\widetilde{B}, 0, 3\delta)$-joint-regular at $c$;

(ii) $\mathcal{B}$ is $(\widetilde{A}, 0, 3\delta)$-joint-regular at $c$; and

(iii) $\theta < 1$, where $\theta := \theta_{3\delta}$ is the joint-CQ-number at $c$ associated with $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$ (see Definition 2.3).

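Suppose also that the starting point of the MAP $b_{-1}$ satisfies $\|b_{-1} - c\| \le \dfrac{(1-\theta)\delta}{6(2-\theta)}$. Then $(a_k)_{k\in\mathbb{N}}$ and $(b_k)_{k\in\mathbb{N}}$ converge linearly to some point $\bar{c} \in A \cap B$ with $\|\bar{c} - c\| \le \delta$ and rate $\theta^2$; in fact,

(22)  $(\forall k \ge 1) \quad \max\{\|a_k - \bar{c}\|, \|b_k - \bar{c}\|\} \le \dfrac{\delta}{2-\theta}\,(\theta^2)^{k-1}.$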



3 Sparse feasibility with an affine constraint 



We now move to the application of feasibility with a sparsity set and an affine subspace, problem (2). Our main result on the convergence of MAP is given in Theorem 3.19. Along the way we develop explicit representations of the projections, normal cones, and tangent cones to the sparsity set (3) and motivate our decomposition of the problem.



Properties of sparsity sets 

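Lemma 3.1 Let $x$ and $y$ be in $\mathbb{R}^n$, and let $\lambda \in \mathbb{R}$. Then the following hold:

(i) $\operatorname{supp}(x) = \operatorname{span}\{e_i \mid i \in I(x)\}$ and $\|x\|_0 = \operatorname{card}(I(x)) = \dim\operatorname{supp}(x)$.

(ii) $x \in \operatorname{supp}(y) \Leftrightarrow I(x) \subseteq I(y) \Leftrightarrow \operatorname{supp}(x) \subseteq \operatorname{supp}(y) \Rightarrow \|x\|_0 \le \|y\|_0$.

(iii) $I(x + y) \subseteq I(x) \cup I(y)$ and $I(\lambda x) = \begin{cases} I(x), & \text{if } \lambda \neq 0;\\ \varnothing, & \text{otherwise.} \end{cases}$

(iv) $I((1-\lambda)x + \lambda y) \subseteq I(x) \cup I(y)$.

(v) $\operatorname{supp}(\lambda x) = \lambda\operatorname{supp}(x)$ and $\|\lambda x\|_0 = |\operatorname{sgn}(\lambda)| \cdot \|x\|_0$.

(vi) $\operatorname{supp}(x + y) \subseteq \operatorname{supp}(x) + \operatorname{supp}(y)$ and $\|x + y\|_0 \le \|x\|_0 + \|y\|_0$.

(vii) If $\operatorname{supp}(x) \subseteq \operatorname{supp}(y)$ and $z \in \operatorname{supp}(y)$, then there exist $u$ and $v$ in $\mathbb{R}^n$ such that $z = u + v$, $u \in \operatorname{supp}(x)$ and $\|v\|_0 \le \|y\|_0 - \|x\|_0$.

(viii) If $\delta \in \big]0, \min\{|x_i| \mid i \in I(x)\}\big[$ and $y \in x + [-\delta, +\delta]^n$, then $\operatorname{supp}(x) \subseteq \operatorname{supp}(y)$.

(ix) If $I(x) \not\subseteq I(y)$ and $I(y) \not\subseteq I(x)$, then

(23)  $\|x + y\|^2 \ge \min_{i\in I(x)\smallsetminus I(y)} |x_i|^2 + \min_{j\in I(y)\smallsetminus I(x)} |y_j|^2 \ge \min_{i\in I(x)} |x_i|^2 + \min_{j\in I(y)} |y_j|^2.$

(x) $\|\cdot\|_0$ is lower semicontinuous.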
Proof. (i)-(v): These follow readily from the definitions.

(vi): By (iii), $I(x + y) \subseteq I(x) \cup I(y)$. Hence $\operatorname{supp}(x + y) \subseteq \operatorname{supp}(x) + \operatorname{supp}(y)$; moreover, taking cardinality and using (i) yields $\|x + y\|_0 \le \|x\|_0 + \|y\|_0$.

(vii): By (ii), we have $I(x) \subseteq I(y)$. Write $I(y) = I(x) \cup J$ as a disjoint union, where $J = I(y) \smallsetminus I(x)$, and note that $\operatorname{card}(J) = \operatorname{card}(I(y)) - \operatorname{card}(I(x)) = \|y\|_0 - \|x\|_0$. Then $\operatorname{supp}(y) = \operatorname{supp}(x) \oplus \operatorname{span}\{e_i \mid i \in J\}$. Now since $z \in \operatorname{supp}(y)$, we can write $z = u + v$, where $u \in \operatorname{supp}(x)$ and $v \in \operatorname{span}\{e_i \mid i \in J\}$ and $\|v\|_0 \le \operatorname{card}(J) = \|y\|_0 - \|x\|_0$.

(viii): If $i \in I(x)$, then $|y_i| \ge |x_i| - |x_i - y_i| > \delta - |x_i - y_i| \ge 0$ and hence $y_i \neq 0$. It follows that $I(x) \subseteq I(y)$. Now apply (ii).

(ix): Let $i_0 \in I(x) \smallsetminus I(y)$ and $j_0 \in I(y) \smallsetminus I(x)$. Then $y_{i_0} = 0$ and $x_{j_0} = 0$, and hence

(24a)  $\|x + y\|^2 \ge |x_{i_0} + y_{i_0}|^2 + |x_{j_0} + y_{j_0}|^2$

(24b)  $\ge \min_{i\in I(x)\smallsetminus I(y)} |x_i|^2 + \min_{j\in I(y)\smallsetminus I(x)} |y_j|^2$

(24c)  $\ge \min_{i\in I(x)} |x_i|^2 + \min_{j\in I(y)} |y_j|^2,$

as claimed.

(x): Indeed, borrowing the notation below, we see that $\{z \in X \mid \|z\|_0 \le \rho\} = \bigcup_{J\in\mathcal{J}_r} A_J$, where $r = \lfloor\rho\rfloor$, is closed as a union of finitely many (closed) linear subspaces. ■

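In order to apply Theorem 2.10 to MAP for solving (2) we must choose a suitable decomposition, $\mathcal{A}$ and $\mathcal{B}$, and restrictions, $\widetilde{\mathcal{A}}$ and $\widetilde{\mathcal{B}}$, and verify the assumptions of the theorem. We now abbreviate

(25a)  $\mathcal{J} := 2^{\{1,2,\ldots,n\}}$ and $\mathcal{J}_s := \mathcal{J}(s) := \{J \in \mathcal{J} \mid \operatorname{card}(J) = s\}$

and set

(25b)  $(\forall J \in \mathcal{J}) \quad A_J := \operatorname{span}\{e_j \mid j \in J\}.$

Define the collections

(25c)  $\mathcal{A} := \widetilde{\mathcal{A}} := (A_J)_{J\in\mathcal{J}_s}$ and $\mathcal{B} := \widetilde{\mathcal{B}} := (B).$

Clearly,

(25d)  $A := \widetilde{A} := \bigcup_{J\in\mathcal{J}_s} A_J = \{x \in \mathbb{R}^n \mid \|x\|_0 \le s\}$ and $B = \widetilde{B} := \{x \in X \mid Mx = p\}.$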
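The proofs of the following two results are elementary and thus omitted.

Proposition 3.2 (properties of $A_J$) Let $J$, $J_1$, and $J_2$ be in $\mathcal{J}$, and let $x \in X$. Then the following hold:

(i) $A_{J_1} \cup A_{J_2} \subseteq A_{J_1\cup J_2} = \operatorname{span}(A_{J_1} \cup A_{J_2})$.

(ii) $J_1 \subseteq J_2 \Leftrightarrow A_{J_1} \subseteq A_{J_2}$.

(iii) $x \in A_{I(x)} = \operatorname{supp}(x)$.

(iv) $I(x) \subseteq J \Leftrightarrow x \in A_J$.

(v) $I(x) \cap J = \varnothing \Leftrightarrow x \in A_J^{\perp}$.

(vi) $s \le n - 1 \Leftrightarrow \operatorname{int} A = \varnothing$.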

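Proposition 3.3 Let $J \in \mathcal{J}$, let $x = (x_1,\ldots,x_n) \in X$, and set $y := P_{A_J}x$. Then

(26)  $(\forall i \in \{1,\ldots,n\}) \quad y_i = \begin{cases} x_i, & \text{if } i \in J;\\ 0, & \text{if } i \notin J, \end{cases}$

and

(27)  $d^2_{A_J}(x) = \sum_{j\notin J} |x_j|^2 = \sum_{j\in I(x)\smallsetminus J} |x_j|^2.$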

The following technical result will be useful later.

Lemma 3.4 Let $c \in A$, and assume that $s \le n - 1$. Then

(28)  $\min\{d_{A_J}(c) \mid c \notin A_J,\ J \in \mathcal{J}_s\} = \min\{|c_j| \mid j \in I(c)\}.$

Proof. First, let $J \in \mathcal{J}_s$ be such that $c \notin A_J$, which is equivalent to $I(c) \not\subseteq J$ by Proposition 3.2(iv). So $I(c) \smallsetminus J \neq \varnothing$. By (27), $d^2_{A_J}(c) = \sum_{j\in I(c)\smallsetminus J} |c_j|^2 \ge \min\{|c_j|^2 \mid j \in I(c)\}$. Hence

(29)  $\min\{d_{A_J}(c) \mid c \notin A_J,\ J \in \mathcal{J}_s\} \ge \min\{|c_j| \mid j \in I(c)\}.$

Since $1 \le 1 + s - \|c\|_0 \le n - \|c\|_0 = \operatorname{card}(\{1,\ldots,n\} \smallsetminus I(c))$, there exists a nonempty subset $K$ of $\{1,\ldots,n\} \smallsetminus I(c)$ with $\operatorname{card}(K) = s - \|c\|_0 + 1$. Let $j \in I(c)$ be such that $|c_j| = \min_{i\in I(c)} |c_i|$ and set

(30)  $J := (I(c) \smallsetminus \{j\}) \cup K.$

Then $c \notin A_J$ and $\operatorname{card}(J) = \operatorname{card}(I(c)) - 1 + \operatorname{card}(K) = \|c\|_0 - 1 + s - \|c\|_0 + 1 = s$. Hence $J \in \mathcal{J}_s$. Because $I(c) \smallsetminus J = \{j\}$, it follows again from (27) that $d^2_{A_J}(c) = \sum_{i\in I(c)\smallsetminus J} |c_i|^2 = |c_j|^2$. Therefore $d_{A_J}(c) = |c_j| = \min_{i\in I(c)} |c_i|$, which yields the inequality complementary to (29). ■
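Identity (28) can be checked by brute force on small instances. The sketch below is illustrative only; the helper names are hypothetical, indices are 0-based, and the enumeration over all index sets of size $s$ is exponential in $n$, so it is meant for toy data.

```python
from itertools import combinations
import math


def dist_to_AJ(c, J):
    """d_{A_J}(c) via (27): Euclidean norm of the entries of c outside J."""
    return math.sqrt(sum(ci * ci for i, ci in enumerate(c) if i not in J))


def min_dist_to_other_subspaces(c, s):
    """Left-hand side of (28): minimum over J with card(J) = s and c not in A_J."""
    support = {i for i, ci in enumerate(c) if ci != 0}
    dists = [dist_to_AJ(c, set(J))
             for J in combinations(range(len(c)), s)
             if not support <= set(J)]      # c not in A_J iff I(c) is not inside J
    return min(dists)


c = [0.0, -3.0, 0.5, 0.0, 2.0]   # n = 5 and ||c||_0 = 3
s = 3                            # so c lies in A and s <= n - 1
lhs = min_dist_to_other_subspaces(c, s)
rhs = min(abs(ci) for ci in c if ci != 0)
print(lhs, rhs)
```

Both sides equal the smallest nonzero magnitude of `c`, attained by swapping the index of that entry for an index outside the support, exactly as in the construction (30).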
Now let $x = (x_1,\ldots,x_n) \in X$, and set

(31)  $C_s(x) := \big\{J \in \mathcal{J}_s \;\big|\; \min_{j\in J} |x_j| \ge \max_{k\notin J} |x_k|\big\};$

in other words, $J \in C_s(x)$ if and only if $J$ contains $s$ indices to the $s$ largest coordinates of $x$ in absolute value.

The proof of the next result is straightforward.

Lemma 3.5 Let $x = (x_1,\ldots,x_n) \in X$ be such that $\|x\|_0 = \operatorname{card}(I(x)) \ge s$, and let $J \in C_s(x)$. Then $J \subseteq I(x)$ and $\min_{j\in J} |x_j| \ge \min_{j\in I(x)} |x_j| > 0$. If $\|x\|_0 = \operatorname{card}(I(x)) = s$, then $C_s(x) = \{I(x)\}$.



Projections 

The decomposition of the sparsity set defined by (25) yields a natural expression for the projection onto this set.

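Proposition 3.6 (Projection onto $A$ and its inverse) Let $x = (x_1,\ldots,x_n) \in X$, and define $A := \{x \in X \mid \|x\|_0 \le s\}$. Then the following hold:

(i) The distance from $x$ to $A$ is solely determined by $C_s(x)$; more precisely,

(32)  $(\forall J \in \mathcal{J}_s) \quad d_{A_J}(x) \begin{cases} = d_A(x), & \text{if } J \in C_s(x);\\ > d_A(x), & \text{if } J \notin C_s(x). \end{cases}$

(ii) The projection of $x$ on $A$ is solely determined by $C_s(x)$; more precisely,

(33)  $P_A(x) = \bigcup_{J\in C_s(x)} P_{A_J}(x) = \bigcup_{J\in C_s(x)} \Big\{ y = (y_1,\ldots,y_n) \in X \;\Big|\; (\forall j \in \{1,\ldots,n\})\ y_j = \begin{cases} x_j, & \text{if } j \in J;\\ 0, & \text{if } j \notin J \end{cases} \Big\}.$

(iii) $(\forall y \in P_A(x))$ $\|y\|_0 = \min\{\|x\|_0, s\}$.

(iv) If $x \notin A$, then $(\forall y \in P_A(x))$ $I(y) \in C_s(x)$ and $\|y\|_0 = s$.

(v) If $a \in A$ and $\|a\|_0 = s$, then

(34)  $P_A^{-1}(a) = \Big\{ y = (y_1,\ldots,y_n) \in X \;\Big|\; (\forall j \in I(a))\ y_j = a_j,\ \max_{k\notin I(a)} |y_k| \le \min_{j\in I(a)} |a_j| \Big\}.$

(vi) If $a \in A$ and $\|a\|_0 < s$, then $P_A^{-1}(a) = a$.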
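Formula (33) translates directly into a procedure that returns the whole (possibly multivalued) projection. The following sketch is an illustration with hypothetical function names and 0-based indices; it enumerates all index sets of size $s$, which is only sensible for small $n$ (in practice one would sort the entries by magnitude instead).

```python
from itertools import combinations


def C_s(x, s):
    """All J with card(J) = s and min_{j in J}|x_j| >= max_{k not in J}|x_k|, as in (31)."""
    n = len(x)
    result = []
    for J in combinations(range(n), s):
        inside = min((abs(x[j]) for j in J), default=float("inf"))
        outside = max((abs(x[k]) for k in range(n) if k not in J), default=0.0)
        if inside >= outside:
            result.append(set(J))
    return result


def project_onto_A(x, s):
    """The set P_A(x) from (33), returned as a list of distinct points."""
    points = []
    for J in C_s(x, s):
        y = [xi if i in J else 0.0 for i, xi in enumerate(x)]
        if y not in points:
            points.append(y)
    return points


print(project_onto_A([3.0, -1.0, 1.0, 0.5], 2))   # tie between entries 1 and 2
print(project_onto_A([3.0, 0.0, 0.0, 0.0], 2))    # x already lies in A
```

In the first call the tie $|x_2| = |x_3|$ produces two nearest points, illustrating that $P_A$ is set-valued; in the second call $x \in A$ and the projection is $x$ itself, although several index sets belong to $C_s(x)$.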
Proof. The following observation will be useful. If $J \in \mathcal{J}_s$, $j \in J$, and $k \notin J$, then $K := (J \smallsetminus \{j\}) \cup \{k\} \in \mathcal{J}_s$ and (27) implies

(35a)  $d^2_{A_K}(x) = \sum_{l\notin K} |x_l|^2 = \|x\|^2 - \sum_{l\in K} |x_l|^2 = \|x\|^2 - \sum_{l\in J\cap K} |x_l|^2 - |x_k|^2$

(35b)  $= \|x\|^2 - \sum_{l\in J\cap K} |x_l|^2 - |x_j|^2 + \big(|x_j|^2 - |x_k|^2\big)$

(35c)  $= \|x\|^2 - \sum_{l\in J} |x_l|^2 + \big(|x_j|^2 - |x_k|^2\big) = \sum_{l\notin J} |x_l|^2 + \big(|x_j|^2 - |x_k|^2\big)$

(35d)  $= d^2_{A_J}(x) + |x_j|^2 - |x_k|^2.$

(i): It is clear that

(36)  $d_A(x) = \min\{d_{A_J}(x) \mid J \in \mathcal{J}_s\}.$

Let $K \in \mathcal{J}_s$ and assume that $K \notin C_s(x)$. Then there exist $j$ and $k$ in $\{1,\ldots,n\}$ such that $k \in K$, $j \notin K$, and $|x_k| < |x_j|$. Now define $J := (K \smallsetminus \{k\}) \cup \{j\}$. Then $J \in \mathcal{J}_s$ and

(37)  $d^2_{A_K}(x) = d^2_{A_J}(x) + |x_j|^2 - |x_k|^2 > d^2_{A_J}(x)$

by (35). It follows that index sets in $\mathcal{J}_s \smallsetminus C_s(x)$ do not contribute to the computation of $d_A(x)$.

Now assume that $J$ and $K$ both belong to $C_s(x)$ and that $J \neq K$. Then $\operatorname{card}(J \smallsetminus K) = \operatorname{card}(K \smallsetminus J)$. Take $j \in J \smallsetminus K$ and $k \in K \smallsetminus J$. Since $j \in J \in C_s(x)$ and $k \notin J$, we have $|x_j| \ge |x_k|$. On the other hand, since $k \in K \in C_s(x)$ and $j \notin K$, we also have $|x_k| \ge |x_j|$. Altogether, $|x_j| = |x_k|$. Thus

(38a)  $d^2_{A_J}(x) = \|x\|^2 - \sum_{l\in J} |x_l|^2 = \|x\|^2 - \sum_{l\in J\cap K} |x_l|^2 - \sum_{l\in J\smallsetminus K} |x_l|^2$

(38b)  $= \|x\|^2 - \sum_{l\in K\cap J} |x_l|^2 - \sum_{l\in K\smallsetminus J} |x_l|^2 = \|x\|^2 - \sum_{l\in K} |x_l|^2 = d^2_{A_K}(x).$

This completes the proof of (32).

(ii): This follows from (32) and (26).



(iii): Case 1: $\|x\|_0 = \operatorname{card}(I(x)) \le s$. Then, by definition, $x \in A$. Thus $P_A(x) = x$ and hence $\|P_A(x)\|_0 = \|x\|_0 = \min\{\|x\|_0, s\}$.

Case 2: $\|x\|_0 = \operatorname{card}(I(x)) > s$. Let $J \in C_s(x)$. Lemma 3.5 implies $\min_{j\in J} |x_j| > 0$. It follows from (33) that there exists $y = (y_1,\ldots,y_n) \in P_A(x)$ such that

(39)  $(\forall j \in J)\ |y_j| = |x_j| > 0$ and $(\forall j \notin J)\ y_j = 0.$

So

(40)  $I(y) = J,$

and hence $\|y\|_0 = \operatorname{card}(J) = s = \min\{\operatorname{card}(I(x)), s\}$.
(iv): Let $y \in P_A(x)$. Since $x \notin A$, we have $\|x\|_0 > s$ and hence (iii) implies that $\|y\|_0 = s$. By (33), there exists $J \in C_s(x)$ such that $I(y) \subseteq J$. But $\operatorname{card} I(y) = s = \operatorname{card} J$, and hence $I(y) = J$.

(v): Denote the right-hand side of (34) by $R$. "$\supseteq$": for every $y \in R$, we have $I(a) \in C_s(y)$. By (33), $a \in P_A y$. Hence $y \in P_A^{-1}(a)$. This establishes $R \subseteq P_A^{-1}(a)$. "$\subseteq$": Suppose that $y \in P_A^{-1}(a)$, i.e., $a \in P_A(y)$. Again by (33), there exists $J \in C_s(y)$ such that

(41)  $(\forall j \in J)\ a_j = y_j$ and $(\forall j \notin J)\ a_j = 0.$

Since $\|a\|_0 = s$, Lemma 3.5 implies that $J = I(a)$. Hence, by (41), $(\forall j \in I(a))$ $y_j = a_j$. On the other hand, by definition of $C_s(y)$, we have $\min_{j\in J} |y_j| \ge \max_{k\notin J} |y_k|$. Altogether, $y \in R$.

(vi): Let $y \in P_A^{-1}a$, i.e., $a \in P_A y$. The hypothesis and (iii) imply $s > \|a\|_0 = \min\{\|y\|_0, s\}$. Hence $\|y\|_0 < s$; therefore, $y \in A$ and so $a = P_A y = y$. ■

Proposition 3.7 (projection onto $B$) (See [4, Lemma 4.1].) Recall that $B = \{x \in X \mid Mx = p\}$. Then the projection onto $B$ is given by

(42)  $P_B \colon X \to X \colon x \mapsto x - M^{\dagger}(Mx - p),$

where $M^{\dagger}$ denotes the Moore-Penrose inverse of $M$.
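Formula (42) can be evaluated without forming $M^{\dagger}$ explicitly. The sketch below makes an additional assumption that is not part of Proposition 3.7: $M$ has full row rank, so that $M^{\dagger} = M^{T}(MM^{T})^{-1}$ and it suffices to solve one small linear system. The helper names and the toy data are hypothetical.

```python
def solve(G, r):
    """Solve G w = r by Gaussian elimination with partial pivoting (G assumed invertible)."""
    n = len(G)
    aug = [row[:] + [ri] for row, ri in zip(G, r)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for i in range(col + 1, n):
            f = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= f * aug[col][j]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (aug[i][n] - sum(aug[i][j] * w[j] for j in range(i + 1, n))) / aug[i][i]
    return w


def project_onto_B(x, M, p):
    """P_B(x) = x - M^T (M M^T)^{-1} (M x - p); valid when M has full row rank."""
    residual = [sum(mij * xj for mij, xj in zip(row, x)) - pi for row, pi in zip(M, p)]
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in M] for ri in M]
    w = solve(gram, residual)
    return [xj - sum(M[i][j] * w[i] for i in range(len(M))) for j, xj in enumerate(x)]


# Toy data (an assumption for illustration): m = 2, n = 3.
M = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
p = [1.0, 2.0]
y = project_onto_B([0.0, 0.0, 0.0], M, p)
print(y)
```

Projecting the origin returns the minimum-norm solution of $Mx = p$, here $(0, 1, 1)$; for rank-deficient $M$ a genuine pseudoinverse routine would be needed instead.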

Normal and tangent cones 

Proposition 3.8 (proximal normal cone to $A$)

(43)  $(\forall a \in A) \quad N^{\mathrm{prox}}_A(a) = \begin{cases} (\operatorname{supp}(a))^{\perp}, & \text{if } \|a\|_0 = s;\\ \{0\}, & \text{if } \|a\|_0 < s. \end{cases}$

Proof. Combine the definition of $N^{\mathrm{prox}}_A(a)$ with Proposition 3.6(v)&(vi). ■

The following is a special case of a more general normal cone formulation for the set of matrices with rank bounded above by $s$ given in [19].

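Theorem 3.9 (Mordukhovich normal cone to $A$)

(44)  $(\forall a \in A) \quad N_A(a) = \{u \in \mathbb{R}^n \mid \|u\|_0 \le n - s\} \cap (\operatorname{supp}(a))^{\perp} = \bigcup_{I(a)\subseteq J\in\mathcal{J}_s} A_J^{\perp}.$

Consequently, if $\|a\|_0 = s$, then $N_A(a) = (\operatorname{supp}(a))^{\perp} = A_{I(a)}^{\perp}$.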



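Proof. Let $a \in A$, and let $\varepsilon \in \big]0, \min\{|a_j| \mid j \in I(a)\}\big[$. Let $x = (x_1,\ldots,x_n) \in A \cap (a + [-\varepsilon, +\varepsilon]^n)$. Then $\|x\|_0 \le s$ and, by Lemma 3.1(viii), $\operatorname{supp}(a) \subseteq \operatorname{supp}(x)$. Hence, using Proposition 3.8 we deduce that

(45)  $N^{\mathrm{prox}}_A(x) = \begin{cases} (\operatorname{supp}(x))^{\perp}, & \text{if } \|x\|_0 = s;\\ \{0\}, & \text{if } \|x\|_0 < s \end{cases} \;\subseteq\; (\operatorname{supp}(a))^{\perp}.$

Note that if $\|x\|_0 = s$, then (45) yields $\dim(\operatorname{supp}(x))^{\perp} = n - s$; in either case,

(46)  $(\forall u \in N^{\mathrm{prox}}_A(x)) \quad \|u\|_0 \le n - s.$

Let $u \in X$. We assume first that $u \in N_A(a)$. Then there exist sequences $(x_k)_{k\in\mathbb{N}}$ in $A \cap (a + [-\varepsilon, +\varepsilon]^n)$ and $(u_k)_{k\in\mathbb{N}}$ in $X$ such that $x_k \to a$, $u_k \to u$, and $(\forall k \in \mathbb{N})$ $u_k \in N^{\mathrm{prox}}_A(x_k)$. It follows from (45), (46), and Lemma 3.1(x) that $u \in (\operatorname{supp}(a))^{\perp}$ and $\|u\|_0 \le n - s$. Thus

(47)  $N_A(a) \subseteq \{u \in \mathbb{R}^n \mid \|u\|_0 \le n - s\} \cap (\operatorname{supp}(a))^{\perp}.$

We now assume that $u \in (\operatorname{supp}(a))^{\perp}$ and $\|u\|_0 \le n - s$. Since $u \in (\operatorname{supp}(a))^{\perp}$, we have $I(a) \cap I(u) = \varnothing$ and hence $I(a) \subseteq \{1,2,\ldots,n\} \smallsetminus I(u)$. Since $a \in A$ and $\operatorname{card} I(u) = \|u\|_0 \le n - s$, we have $\operatorname{card} I(a) \le s \le \operatorname{card}(\{1,2,\ldots,n\} \smallsetminus I(u))$. Let $J \in \mathcal{J}_s$ be such that $I(a) \subseteq J \subseteq \{1,2,\ldots,n\} \smallsetminus I(u)$. By Proposition 3.2(v), $u \in A_J^{\perp}$. We have established that

(48)  $\{u \in \mathbb{R}^n \mid \|u\|_0 \le n - s\} \cap (\operatorname{supp}(a))^{\perp} \subseteq \bigcup_{I(a)\subseteq J\in\mathcal{J}_s} A_J^{\perp}.$

Finally, assume that $u \in A_J^{\perp}$, where $\operatorname{card} J = s$ and $I(a) \subseteq J$. Set

(49)  $(\forall \varepsilon \in \mathbb{R}_{++})(\forall j \in \{1,2,\ldots,n\}) \quad x_{\varepsilon,j} := \begin{cases} a_j, & \text{if } j \in I(a);\\ \varepsilon, & \text{if } j \in J \smallsetminus I(a);\\ 0, & \text{otherwise.} \end{cases}$

This defines a bounded net $(x_\varepsilon)_{\varepsilon\in]0,1[}$ in $X$ with $x_\varepsilon \to a$ as $\varepsilon \to 0$. Note that $(\forall \varepsilon \in ]0,1[)$ $I(x_\varepsilon) = J$; hence, $x_\varepsilon \in \operatorname{supp}(x_\varepsilon) = A_J \subseteq A$ and, by Proposition 3.8, $u \in A_J^{\perp} = (\operatorname{supp}(x_\varepsilon))^{\perp} = N^{\mathrm{prox}}_A(x_\varepsilon)$. Thus $u \in N_A(a)$. We have established the inclusion

(50)  $\bigcup_{I(a)\subseteq J\in\mathcal{J}_s} A_J^{\perp} \subseteq N_A(a).$

This completes the proof of (44).

Finally, if $\|a\|_0 = s$, then $\operatorname{card} I(a) = s$ and the only choice for $J$ in (44) is $I(a)$. ■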
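The first description in (44) gives a direct membership test for $N_A(a)$. The following sketch is illustrative only (the function name is hypothetical and indices are 0-based): it checks the sparsity bound $\|u\|_0 \le n - s$ and the orthogonality $u \in (\operatorname{supp}(a))^{\perp}$, which by Proposition 3.2(v) amounts to $I(a) \cap I(u) = \varnothing$.

```python
def in_mordukhovich_normal_cone(u, a, s):
    """Test u in N_A(a) via (44): ||u||_0 <= n - s and I(a), I(u) disjoint."""
    n = len(a)
    I_a = {i for i, ai in enumerate(a) if ai != 0}
    I_u = {i for i, ui in enumerate(u) if ui != 0}
    if len(I_a) > s:
        raise ValueError("a must lie in A, i.e. ||a||_0 <= s")
    return len(I_u) <= n - s and not (I_a & I_u)


a = [1.0, 0.0, 0.0]   # n = 3, s = 2, ||a||_0 = 1 < s
print(in_mordukhovich_normal_cone([0.0, 1.0, 0.0], a, 2))
print(in_mordukhovich_normal_cone([0.0, 0.0, 1.0], a, 2))
print(in_mordukhovich_normal_cone([0.0, 1.0, 1.0], a, 2))
```

In this example the two coordinate directions $e_2$ and $e_3$ are normal vectors but their sum is not, which illustrates that $N_A(a)$ is a nonconvex union of subspaces when $\|a\|_0 < s$.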
We now turn to the classical tangent cone of $A$.

Definition 3.10 (tangent cone) Let $C$ be a nonempty subset of $X$, and let $c \in C$. Then a vector $v \in X$ belongs to the tangent cone to $C$ at $c$, denoted $T_C(c)$, if there exist sequences $(x_k)_{k\in\mathbb{N}}$ in $C$ and $(t_k)_{k\in\mathbb{N}}$ in $\mathbb{R}_{++}$ such that $x_k \to c$, $t_k \to 0$, and $(x_k - c)/t_k \to v$.

The proof of the following result is elementary and hence omitted.

Lemma 3.11 Let $C$ be a nonempty subset of $X$, let $c \in C$, and assume that $(Y_k)_{k\in K}$ is a finite collection of affine subspaces such that $y \in \bigcap_{k\in K} Y_k \subseteq Y := \bigcup_{k\in K} Y_k$. Then the following hold:

(i) $(\forall \rho \in \mathbb{R}_{++})$ $T_C(c) = T_{C\cap\operatorname{ball}(c;\rho)}(c)$.

(ii) $T_Y(y) = \bigcup_{k\in K} \operatorname{par}(Y_k)$.

(iii) If each $Y_k$ is a linear subspace, then $T_Y(y) = Y$.

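Lemma 3.12 Let $a = (a_1,\ldots,a_n) \in A$ and suppose that $0 < \rho < \min_{j\in I(a)} |a_j|$. Then

(51)  $\operatorname{ball}(a;\rho) \cap A = \operatorname{ball}(a;\rho) \cap \bigcup_{I(a)\subseteq J\in\mathcal{J}_s} A_J.$

Proof. The inclusion "$\supseteq$" is clear. To prove "$\subseteq$", let $x \in A \cap \operatorname{ball}(a;\rho)$. If $I(x) \not\subseteq I(a)$ and $I(a) \not\subseteq I(x)$, then Lemma 3.1(ix) implies $\rho^2 \ge \|x - a\|^2 \ge \min_{j\in I(x)} |x_j|^2 + \min_{j\in I(a)} |a_j|^2 > \rho^2$, which is absurd. Therefore, $I(x) \subseteq I(a)$ or $I(a) \subseteq I(x)$. Furthermore, there exists $J \in \mathcal{J}_s$ such that $I(a) \subseteq I(a) \cup I(x) \subseteq J$. By Proposition 3.2(iv), $x \in A_J$. This completes the proof.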



Corollary 3.13 Let $a \in A$. If $s = n$, then $A$ is superregular at $a$; otherwise, $A$ is superregular at $a$ $\Leftrightarrow$ $\|a\|_0 = s$.

Proof. Since $A = X$ if $s = n$, the first statement is clear. We now consider two cases. Case 1: $\|a\|_0 \le s - 1$. By (44),

(52)  $N_A(a) = \bigcup_{I(a)\subseteq J\in\mathcal{J}_s} A_J^{\perp}.$

Since $\operatorname{card} I(a) < s$, $N_A(a)$ is therefore the finite union of two or more different linear subspaces of $X$ all of the same dimension $n - s$. Hence $N_A(a)$ cannot be convex. On the other hand, $N^{\mathrm{Fr\acute{e}}}_A(a)$ is always convex. Altogether, $N^{\mathrm{Fr\acute{e}}}_A(a) \neq N_A(a)$. Thus, by [23, Definition 6.4], $A$ is not Clarke regular at $a$. Hence [16, Corollary 4.5] implies that $A$ is not superregular at $a$.

Case 2: $\|a\|_0 = s$. Let $\rho$ be as in Lemma 3.12. Then Lemma 3.12 implies that

(53)  $\operatorname{ball}(a;\rho) \cap A = \operatorname{ball}(a;\rho) \cap A_{I(a)}$

is convex because it is the intersection of a ball and a linear subspace. By [3, Remark 9.2(vii)], $A$ is superregular at $a$. ■

Lemma 3.14 Let $a \in A$. Then

(54)  $\bigcup_{I(a)\subseteq J\in\mathcal{J}_s} A_J = \operatorname{supp}(a) + \{x \in X \mid \|x\|_0 \le s - \|a\|_0\}.$

Proof. "$\subseteq$": Let $z \in A_J$, where $I(a) \subseteq J \in \mathcal{J}_s$. Write $J = I(a) \cup K$, where $K := J \smallsetminus I(a)$ and the union is disjoint. Then $z = y + x$, where $y \in A_{I(a)} = \operatorname{supp}(a)$, $x \in A_K$, and $\|x\|_0 \le \operatorname{card}(K) = \operatorname{card}(J) - \operatorname{card}(I(a)) = s - \|a\|_0$.

"$\supseteq$": Let $x \in X$ be such that $\|x\|_0 \le s - \|a\|_0$, and let $y \in \operatorname{supp}(a)$. By Lemma 3.1, $I(y) \subseteq I(a)$, $I(x + y) \subseteq I(x) \cup I(y) \subseteq I(x) \cup I(a)$ and $\|x + y\|_0 \le \|x\|_0 + \|y\|_0 \le (s - \|a\|_0) + \|a\|_0 = s$. Hence, there exists $J \in \mathcal{J}_s$ such that $I(x) \cup I(a) \subseteq J$, and therefore $x + y \in A_{I(x)\cup I(a)} \subseteq A_J$. ■

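Theorem 3.15 (tangent cone to A) Let $a = (a_1, \ldots, a_n) \in A$. Then

(55) $T_A(a) = \bigcup_{I(a) \subseteq J \in \mathcal{J}_s} A_J = \operatorname{supp}(a) + \{x \in X \mid \|x\|_0 \le s - \|a\|_0\};$

consequently,

(56) $\|a\|_0 = s \;\Leftrightarrow\; T_A(a) = A_{I(a)} = \operatorname{supp}(a).$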



Proof. Set

(57) $\rho := \min_{j \in I(a)} |a_j| > 0$ and $A(a) := \bigcup_{I(a) \subseteq J \in \mathcal{J}_s} A_J = \bigcup_{a \in A_J,\ J \in \mathcal{J}_s} A_J.$

Lemma 3.11 and Lemma 3.12 imply

(58) $T_A(a) = T_{A \cap \operatorname{ball}(a;\rho)}(a) = T_{A(a) \cap \operatorname{ball}(a;\rho)}(a) = T_{A(a)}(a).$

On the other hand, by Lemma 3.11(iii), $T_{A(a)}(a) = A(a)$. Altogether, $T_A(a) = A(a)$ and we have established the first equality in (55). The second equality is precisely Lemma 3.14. Finally, the "consequently" part is clear from (55). ■

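The description of $T_A(a)$ as $\operatorname{supp}(a) + \{x \mid \|x\|_0 \le s - \|a\|_0\}$ gives a simple membership test: split $z$ into its part on $I(a)$ and its part off $I(a)$, and count the nonzero entries of the latter. If any decomposition $z = y + x$ of the required form exists, then $I(z) \setminus I(a) \subseteq I(x)$, so this canonical split also works. The Python sketch below illustrates this; the function names are hypothetical and $a$ is assumed to lie in $A$.

```python
def support(x):
    """Index set I(x) of the nonzero entries of x (0-based indices)."""
    return {j for j, v in enumerate(x) if v != 0}

def split_tangent(z, a, s):
    """Return (y, x) with z = y + x, y in supp(a), ||x||_0 <= s - ||a||_0.

    Returns None when z is not in the set on the right-hand side of (55).
    Assumption: a has at most s nonzero entries.
    """
    Ia = support(a)
    y = [v if j in Ia else 0.0 for j, v in enumerate(z)]   # part of z on I(a)
    x = [0.0 if j in Ia else v for j, v in enumerate(z)]   # part of z off I(a)
    if len(support(x)) <= s - len(Ia):
        return y, x
    return None

a = (1.0, 0.0, 0.0)
print(split_tangent((5.0, 0.0, 0.0), a, 1))  # ||a||_0 = s: only supp(a) is allowed
print(split_tangent((1.0, 1.0, 0.0), a, 1))  # leaves supp(a), so not tangent
print(split_tangent((1.0, 1.0, 0.0), a, 2))  # one extra nonzero entry is allowed
```

With $s = 1$ the tangent cone at $a = e_1$ reduces to $\operatorname{supp}(a) = \mathbb{R}e_1$, in line with (56).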


Remark 3.16 For the affine set $B$, the normal and tangent cones are much simpler to derive: indeed, because $\operatorname{par}(B) = \ker M$, it follows that $T_B(x) = \ker M$ and $N_B(x) = (\ker M)^\perp = \operatorname{ran} M^T$, for every $x \in B$.

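Remark 3.17 (transversality) Recall (2) and assume that $c \in A \cap B$. By (55), Remark 3.16, and e.g. [21, Lemma 1.43(i)], we have the implications

(59a) $T_A(c) + T_B(c) = \mathbb{R}^n \;\Leftrightarrow\; \Big( \bigcup_{I(c) \subseteq J \in \mathcal{J}_s} A_J \Big) + \ker(M) = \mathbb{R}^n$

(59b) $\Leftrightarrow\; \bigcup_{I(c) \subseteq J \in \mathcal{J}_s} \big( A_J + \ker(M) \big) = \mathbb{R}^n$

(59c) $\Leftrightarrow\; \operatorname{int} \Big( \bigcup_{I(c) \subseteq J \in \mathcal{J}_s} \big( A_J + \ker(M) \big) \Big) = \mathbb{R}^n$

(59d) $\Rightarrow\; \operatorname{int} \bigcup_{I(c) \subseteq J \in \mathcal{J}_s} \big( A_J + \ker(M) \big) = \bigcup_{I(c) \subseteq J \in \mathcal{J}_s} \operatorname{int} \big( A_J + \ker(M) \big) = \mathbb{R}^n.$

Let us assume momentarily that $T_A(c) + T_B(c) = \mathbb{R}^n$. By (59), there exists $J \in \mathcal{J}_s$ such that $I(c) \subseteq J$ and $A_J + \ker(M) = \mathbb{R}^n$. Hence $s + \dim\ker(M) = \dim A_J + \dim\ker(M) \ge \dim(A_J + \ker(M)) = \dim \mathbb{R}^n = n = \dim\ker(M) + \operatorname{rank}(M)$. We have established the implication

(60) $T_A(c) + T_B(c) = \mathbb{R}^n \;\Rightarrow\; s \ge \operatorname{rank}(M);$

that is, transversality imposes a lower bound on $s$ and is thus at odds with the objective of finding the sparsest points in $A \cap B$.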



The MAP for the sparse feasibility problem 

We begin with an example illustrating shortcomings of previous approaches. 
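Example 3.18 Suppose that

(61) $M = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}$, $p = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, and $s = 1$;

thus, $m = 2$ and $n = 3$. Then $B = (1,0,0) + \mathbb{R}(-1,1,0)$ and hence the set of all solutions to (2) consists of³ $x^* := (1,0,0)$ and $y^* := (0,1,0)$. Since $\|x^*\|_0 = \|y^*\|_0 = s$, Theorem 3.? yields

(62) $N_A(x^*) = \{0\} \times \mathbb{R} \times \mathbb{R}$ and $N_A(y^*) = \mathbb{R} \times \{0\} \times \mathbb{R}.$

On the other hand, $(\forall x \in B)$ $N_B(x) = \operatorname{ran} M^T = \operatorname{span}\{(1,1,1), (1,1,0)\}$ by Remark 3.16. Altogether,

(63) $N_A(x^*) \cap (-N_B(x^*)) = N_A(y^*) \cap (-N_B(y^*)) = \{0\} \times \{0\} \times \mathbb{R} \ne \{(0,0,0)\}.$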

Consequently, neither the Lewis-Luke-Malick framework [16] nor the framework proposed in [18] is able to deal with this case. Furthermore, in view of (60), the transversality condition

(64) $T_A(c) + T_B(c) = \mathbb{R}^n$

proposed by Lewis and Malick [17] also always fails because $s = 1 < 2 = \operatorname{rank}(M)$.
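
The rank obstruction (60) is easy to test numerically. The Python sketch below is a minimal illustration under one stated assumption: the rows $(1,1,1)$ and $(1,1,0)$ are taken as $M$ because they span the set $\operatorname{ran} M^T$ given in this example. The functions `rank` and `transversality_possible` are hypothetical helpers, and the rank is computed by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def transversality_possible(rows, s):
    """Necessary condition (60): transversality can hold only if s >= rank(M)."""
    return s >= rank(rows)

M = [[1, 1, 1], [1, 1, 0]]  # assumed rows, spanning ran M^T as stated in the example
print(rank(M), transversality_possible(M, 1))
```

For these rows the rank is 2, so the test fails for $s = 1$. Note that (60) is only a necessary condition: passing the test does not establish transversality.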

Finally, readers familiar with sparse optimization will also note that the usual sufficient conditions for the correspondence of solutions to the nonconvex problem to those of convex relaxations, namely the restricted isometry property [9] or the mutual coherence condition [13], are not satisfied either. Constraint qualifications as developed in the present work have no apparent relation to conditions like restricted isometry or mutual coherence conditions used to guarantee the correspondence between solutions to convex surrogate problems and solutions to the problem with the original $\|\cdot\|_0$ objective. Indeed, if the matrix $M$ is changed for instance to



(65) 



111 
12 



the mutual coherence condition is satisfied and a unique sparsest solution exists, but still the constraint qualifications (63) and (64) are not satisfied.



³ When there is no cause for confusion, we shall write column vectors as row vectors for space reasons.

We are now ready for our main result, which is very general and which in particular is applicable to the setting of Example 3.18.



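Theorem 3.19 (main result for sparse affine feasibility and linear local convergence of MAP)

Let $A$, $\mathcal{A}$, $\widetilde{\mathcal{A}}$, $B$, $\mathcal{B}$ and $\widetilde{\mathcal{B}}$ be defined by (?). Suppose that $s \le n - 1$, that $c \in A \cap B$, and fix $\delta \in \left]0, \bar\delta\right[$ for $\bar\delta := \tfrac{1}{3} \min\{ d_{A_J}(c) \mid c \notin A_J,\ J \in \mathcal{J}_s \}$. Then

(66) $\bar\delta = \tfrac{1}{3} \min\{ |c_j| \mid j \in I(c) \}$

and

(67) $\alpha = \bar\theta = \theta_{3\delta}(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}}) = \max\{ c(A_J, B) \mid c \in A_J,\ J \in \mathcal{J}_s \} < 1,$

where $\theta_{3\delta}$, $\bar\theta$, $\alpha$ denote the joint-CQ-number, the limiting joint-CQ-number and the exact joint-CQ-number ((12), (13) and (16), respectively) at $c$ associated with $(\mathcal{A}, \widetilde{\mathcal{A}}, \mathcal{B}, \widetilde{\mathcal{B}})$. Suppose the starting point $b_{-1}$ of the MAP satisfies $\|b_{-1} - c\| \le \frac{(1 - \bar\theta)\delta}{6(2 - \bar\theta)}$. Then $(a_k)_{k \in \mathbb{N}}$ and $(b_k)_{k \in \mathbb{N}}$ converge linearly to some point in $A \cap B \cap \operatorname{ball}(c;\delta)$ with rate $\bar\theta^2$.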

Proof. Observe that (66) follows from an earlier lemma. Let $J \in \mathcal{J}_s$. If $c \notin A_J$, then $\operatorname{ball}(c;3\delta) \cap A_J = \varnothing$ and hence $\theta_{3\delta}(A_J, A_J, B, B) = -\infty$. On the other hand, if $c \in A_J$, then $c \in A_J \cap B$ and hence $\theta_{3\delta}(A_J, A_J, B, B) = c(A_J, B) < 1$ by Theorem 2.6. Combining this with Theorem 2.5(iv), we obtain (67). Because $A_J$ is a linear subspace and hence convex, Proposition 2.9 yields the $(0,+\infty)$-joint-regularity of $\mathcal{A}$; in particular, $\mathcal{A}$ is $(B, 0, 3\delta)$-joint-regular. Analogously, $\mathcal{B} = (B)$ is $(A, 0, 3\delta)$-joint-regular. Now apply Theorem 2.10 to complete the proof. ■

Remark 3.20 Some comments regarding Theorem 3.19 are in order.

(i) Note that regularity of the intersection is not an assumption of the theorem, but is rather automatically satisfied. This is in contrast to the results of [17] and [16], where the required regularity is assumed to hold. In view of Example 3.18, which illustrated that the notions of regularity developed in [17] and [16] are not satisfied, it is clear that Theorem 3.19 is new and has a genuinely wider range of applicability.

(ii) In contrast to [16] and [17], our analysis yields a quantification of the neighborhood on which local linear convergence is guaranteed.

(iii) Finding the local neighborhood on which linear convergence is guaranteed is not an easy task, and may well be tantamount to finding the sparsest solution; however, it does open the door to justifying the combination of the MAP with more aggressive algorithms such as Douglas-Rachford in order to find such neighborhoods.

(iv) Consider again Example 3.18 and its notation. Since $s = 1$, $\mathcal{A} = (A_1, A_2, A_3)$, where $A_i = \mathbb{R}e_i$, while $B = e_1 + \mathbb{R}(e_2 - e_1)$. Hence $c(A_1, B) = c(\mathbb{R}e_1, \mathbb{R}(e_2 - e_1)) = |\langle e_1, (e_2 - e_1)/\sqrt{2} \rangle| = 1/\sqrt{2}$ by Theorem 2.6 and Corollary 2.?. Similarly, $c(A_2, B) = 1/\sqrt{2}$ while $A_3 \cap B = \varnothing$. Let $c \in \{x^*, y^*\}$. Then $\bar\theta = 1/\sqrt{2}$ and (66) implies that $\bar\delta = 1/3$. The predicted rate of linear convergence is $\bar\theta^2 = 1/2$.

(v) The projectors $P_A$ and $P_B$ given by (33) and (42) are easy to implement numerically, which we have done. Indeed, for random initial guesses $b_{-1}$ in the neighborhood $\operatorname{ball}\big(c; (\sqrt{2} - 1)/(18(2\sqrt{2} - 1))\big)$, the observed ratios $\|a_{k+1} - c\| / \|a_k - c\|$ and $\|b_{k+1} - c\| / \|b_k - c\|$ for $a_k = P_A b_k$ ($k \in \mathbb{N}$, $b_0 = P_B b_{-1}$) and $b_k = P_B a_{k-1} \in B$ ($k \in \mathbb{N} \setminus \{0\}$) are $1/2 + |O(10^{-13})|$. The observed rate corresponds nicely to the theory under the assumption of exact evaluation of the projections. However, exact projections are not in fact computed in practice (in particular the projection onto the affine set $B$), so the numerical illustration is not precisely applicable. Inexact alternating projections are beyond the scope of this work.
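
The experiment in item (v) can be imitated with a few lines of Python. This is a minimal sketch for the specific data of Example 3.18, not the authors' implementation. It assumes that a nearest point of $A$ is obtained by keeping the $s$ entries of largest magnitude, and it projects onto $B$ through the parametrization $B = e_1 + \mathbb{R}(e_2 - e_1)$ rather than through $M$ and $p$. All function names are hypothetical, and the starting point is an arbitrary choice inside the ball quoted in (v).

```python
import math

E1 = (1.0, 0.0, 0.0)   # base point of B
D = (-1.0, 1.0, 0.0)   # direction of B, i.e. e_2 - e_1

def project_sparse(x, s):
    """One nearest point in A: keep the s entries of largest magnitude."""
    order = sorted(range(len(x)), key=lambda j: abs(x[j]), reverse=True)
    keep = set(order[:s])
    return tuple(v if j in keep else 0.0 for j, v in enumerate(x))

def project_line(x, base=E1, direction=D):
    """Orthogonal projection onto the affine line base + R * direction."""
    t = sum((xi - bi) * di for xi, bi, di in zip(x, base, direction))
    t /= sum(di * di for di in direction)
    return tuple(bi + t * di for bi, di in zip(base, direction))

def dist(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def map_ratios(b_start, c, s=1, steps=15):
    """Run the MAP from b_{-1} and record ||a_{k+1} - c|| / ||a_k - c||."""
    b = project_line(b_start)    # b_0 = P_B b_{-1}
    a = project_sparse(b, s)     # a_0 = P_A b_0
    ratios = []
    for _ in range(steps):
        b = project_line(a)      # b_{k+1} = P_B a_k
        a_next = project_sparse(b, s)
        ratios.append(dist(a_next, c) / dist(a, c))
        a = a_next
    return a, ratios

c = E1                                                            # c = x*
radius = (math.sqrt(2) - 1) / (18 * (2 * math.sqrt(2) - 1))       # ball from (v)
b_start = (1.005, -0.004, 0.006)                                  # arbitrary start
assert dist(b_start, c) < radius
a_final, ratios = map_ratios(b_start, c)
print(radius, ratios[0], ratios[-1])
```

In this run the recorded ratios stay close to $1/2$, matching the predicted rate. The number of steps is kept small because the ratio loses accuracy in floating-point arithmetic once $\|a_k - c\|$ becomes tiny.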



Conclusion 

We have applied new tools in variational analysis to the problem of finding sparse vectors in an affine subspace. The key tool is the restricted normal cone, which generalizes classical normal cones. The restricted normal cones are used to define constraint qualifications and notions of regularity that provide sufficient conditions for local convergence of iterates of the elementary method of alternating projections applied to the lower level sets of the function $\|\cdot\|_0$ and an affine set. Key ingredients were suitable restricting sets ($\widetilde{A}$ and $\widetilde{B}$). The coarsest choice, $(\widetilde{A}, \widetilde{B}) = (X, X)$, recovers the framework by Lewis, Luke, and Malick [16]. We show, however, that the corresponding regularity conditions are not satisfied in general for the sparse feasibility problem (2). The tighter (and hence more powerful) choice of $(\widetilde{A}, \widetilde{B}) = (A, B)$ recovers local linear convergence and yields an estimate of the radius of convergence.